SEQUENCE LISTING



(1) GENERAL INFORMATION:

(i) APPLICANT: Skatrud, Paul L. 

Peery, Robert B. 
de Waard, Maarten 

(ii) TITLE OF INVENTION: Multiple Drug Resistance Gene atrD of 
Aspergillus Nidulans 

(iii) NUMBER OF SEQUENCES: 3 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: Eli Lilly and Company 

(B) STREET: Lilly Corporate Center

(C) CITY: Indianapolis

(D) STATE: Indiana

(E) COUNTRY: U.S.

(F) ZIP: 46285

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS

(D) SOFTWARE: PatentIn Release #1.0, Version #1.30

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: 

(B) FILING DATE: 

(C) CLASSIFICATION: 

(viii) ATTORNEY/AGENT INFORMATION:

(A) NAME: Webster, Thomas D. 

(B) REGISTRATION NUMBER: 39,872 

(C) REFERENCE/DOCKET NUMBER: X-11766

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: 317-276-3334 

(B) TELEFAX: 317-276-2763 

(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4002 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO



(ix) FEATURE: 

(A) NAME/KEY: CDS

(B) LOCATION: 1..4002



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

ATG TCC CCG CTA GAG ACA AAT CCC CTT TCG CCA GAG ACT GCT ATG CGC
Met Ser Pro Leu Glu Thr Asn Pro Leu Ser Pro Glu Thr Ala Met Arg



GCT GCG GAC GAG AAG AAA ATC CTC AGC GAC CTC TCG GCT CCA TCT AGT
Ala Ala Asp Glu Lys Lys Ile Leu Ser Asp Leu Ser Ala Pro Ser Ser
35 40 45

ACT ACA GCA ACC CCC GCA GAC AAG GAG CAC CGT CCT AAA TCG TCG TCC
Thr Thr Ala Thr Pro Ala Asp Lys Glu His Arg Pro Lys Ser Ser Ser
50 55 60

AGC AAT AAT GCG GTC TCG GTC AAC GAA GTC GAT GCG CTT ATT GCG CAC
Ser Asn Asn Ala Val Ser Val Asn Glu Val Asp Ala Leu Ile Ala His
65 70 75 80

CTG CCA GAA GAC GAG AGG CAG GTC TTG AAG ACG CAG CTG GAG GAG ATC
Leu Pro Glu Asp Glu Arg Gln Val Leu Lys Thr Gln Leu Glu Glu Ile
85 90 95

AAA GTA AAC ATC TCC TTC TTC GGT CTC TGG CGG TAT GCA ACA AAG ATG
Lys Val Asn Ile Ser Phe Phe Gly Leu Trp Arg Tyr Ala Thr Lys Met
100 105 110

GAT ATA CTT ATC ATG GTA ATC AGT ACA ATC TGT GCC ATT GCT GCC GCG
Asp Ile Leu Ile Met Val Ile Ser Thr Ile Cys Ala Ile Ala Ala Ala
115 120 125

TCG ACT TTC CAG AGG ATA ATG TTA TAT CAA ATC TCG TAC GAC GAG TTC
Ser Thr Phe Gln Arg Ile Met Leu Tyr Gln Ile Ser Tyr Asp Glu Phe
130 135 140

TAT GAT GAA TTG ACC AAG AAC GTA CTG TAC TTC GTA TAC CTC GGT ATC
Tyr Asp Glu Leu Thr Lys Asn Val Leu Tyr Phe Val Tyr Leu Gly Ile
145 150 155 160

GGC GAG TTT GTC ACT GTC TAT GTT AGT ACT GTT GGC TTC ATC TAT ACC
Gly Glu Phe Val Thr Val Tyr Val Ser Thr Val Gly Phe Ile Tyr Thr
165 170 175

GGA GAA CAC GCC ACG CAG AAG ATC CGC GAG TAT TAC CTT GAG TCT ATC
Gly Glu His Ala Thr Gln Lys Ile Arg Glu Tyr Tyr Leu Glu Ser Ile
180 185 190

CTG CGC CAG AAC ATT GGC TAT TTT GAT AAA CTC GGT GCC GGG GAA GTG
Leu Arg Gln Asn Ile Gly Tyr Phe Asp Lys Leu Gly Ala Gly Glu Val
195 200 205

ACC ACC CGT ATA ACA GCC GAT ACA AAC CTT ATC CAG GAT GGC ATT TCG
Thr Thr Arg Ile Thr Ala Asp Thr Asn Leu Ile Gln Asp Gly Ile Ser
210 215 220

GAG AAG GTC GGT CTC ACT TTG ACT GCC CTG GCG ACA TTC GTG ACA GCA
Glu Lys Val Gly Leu Thr Leu Thr Ala Leu Ala Thr Phe Val Thr Ala
225 230 235 240

TTC ATT ATC GCC TAC GTC AAA TAC TGG AAG TTG GCT CTA ATT TGC AGC
Phe Ile Ile Ala Tyr Val Lys Tyr Trp Lys Leu Ala Leu Ile Cys Ser
245 250 255

TCA ACA ATT GTG GCC CTC GTT CTC ACC ATG GGC GGT GGT TCT CAG TTT

ATC ATC AAG TAC AGC AAA AAG TCG CTT GAC AGC TAC GGT GCA GGC GGC
Ile Ile Lys Tyr Ser Lys Lys Ser Leu Asp Ser Tyr Gly Ala Gly Gly
275 280 285

ACT GTT GCG GAA GAG GTC ATC AGC TCC ATC AGA AAT GCC ACA GCG TTT
Thr Val Ala Glu Glu Val Ile Ser Ser Ile Arg Asn Ala Thr Ala Phe
290 295 300

GGC ACC CAA GAC AAG CTT GCG AAG CAG TAT GAG GTC CAC TTA GAC GAA
Gly Thr Gln Asp Lys Leu Ala Lys Gln Tyr Glu Val His Leu Asp Glu
305 310 315 320

GCT GAG AAA TGG GGA ACA AAG AAC CAG ATT GTC ATG GGT TTC ATG ATT
Ala Glu Lys Trp Gly Thr Lys Asn Gln Ile Val Met Gly Phe Met Ile
325 330 335

GGC GCC ATG TTT GGC CTT ATG TAC TCG AAC TAC GGT CTT GGC TTC TGG
Gly Ala Met Phe Gly Leu Met Tyr Ser Asn Tyr Gly Leu Gly Phe Trp
340 345 350

ATG GGT TCT CGT TTC CTG GTA GAT GGT GCA GTC GAT GTG GGT GAT ATT
Met Gly Ser Arg Phe Leu Val Asp Gly Ala Val Asp Val Gly Asp Ile
355 360 365

CTC ACA GTT CTC ATG GCC ATC TTG ATC GGA TCG TTC TCC TTG GGG AAC
Leu Thr Val Leu Met Ala Ile Leu Ile Gly Ser Phe Ser Leu Gly Asn
370 375 380

GTT AGT CCA AAT GCT CAA GCA TTT ACA AAC GCT GTG GCC GCG GCC GCA
Val Ser Pro Asn Ala Gln Ala Phe Thr Asn Ala Val Ala Ala Ala Ala
385 390 395 400

AAG ATA TTT GGA ACG ATC GAT CGC CAG TCC CCA TTA GAT CCA TAT TCG
Lys Ile Phe Gly Thr Ile Asp Arg Gln Ser Pro Leu Asp Pro Tyr Ser
405 410 415

AAC GAA GGG AAG ACG CTC GAC CAT TTT GAG GGC CAC ATT GAG TTA CGC
Asn Glu Gly Lys Thr Leu Asp His Phe Glu Gly His Ile Glu Leu Arg
420 425 430

AAT GTC AAG CAT ATT TAC CCA TCT AGA CCC GAG GTC ACC GTC ATG GAG
Asn Val Lys His Ile Tyr Pro Ser Arg Pro Glu Val Thr Val Met Glu
435 440 445

GAT GTT TCT CTG TCA ATG CCC GCT GGA AAA ACA ACC GCT TTA GTC GGC
Asp Val Ser Leu Ser Met Pro Ala Gly Lys Thr Thr Ala Leu Val Gly
450 455 460

CCC TCT GGC TCT GGA AAA AGT ACG GTG GTC GGC TTG GTT GAG CGA TTC
Pro Ser Gly Ser Gly Lys Ser Thr Val Val Gly Leu Val Glu Arg Phe
465 470 475 480

TAC ATG CCT GTT CGC GGT ACG GTT TTG CTG GAT GGC CAT GAC ATC AAG
Tyr Met Pro Val Arg Gly Thr Val Leu Leu Asp Gly His Asp Ile Lys
485 490 495

GAC CTC AAT CTC CGC TGG CTT CGC CAA CAG ATC TCT TTG GTT AGC CAG
Asp Leu Asn Leu Arg Trp Leu Arg Gln Gln Ile Ser Leu Val Ser Gln
500 505 510

GAG CCT GTT CTT TTT GGC ACG ACG ATT TAT AAG AAT ATT AGG CAC GGT
Glu Pro Val Leu Phe Gly Thr Thr Ile Tyr Lys Asn Ile Arg His Gly

CTC ATC GGC ACA AAG TAC GAG AAT GAA TCC GAG GAT AAG GTC CGG GAA
Leu Ile Gly Thr Lys Tyr Glu Asn Glu Ser Glu Asp Lys Val Arg Glu
530 535 540

CTC ATC GAG AAC GCG GCA AAA ATG GCG AAT GCT CAT GAC TTT ATT ACT
Leu Ile Glu Asn Ala Ala Lys Met Ala Asn Ala His Asp Phe Ile Thr
545 550 555 560

GCC TTG CCT GAA GGT TAT GAG ACC AAT GTT GGG CAG CGT GGC TTT CTC
Ala Leu Pro Glu Gly Tyr Glu Thr Asn Val Gly Gln Arg Gly Phe Leu
565 570 575

CTT TCA GGT GGC CAG AAA CAG CGC ATT GCA ATC GCC CGT GCC GTT GTT
Leu Ser Gly Gly Gln Lys Gln Arg Ile Ala Ile Ala Arg Ala Val Val
580 585 590

AGT GAC CCA AAA ATC CTG CTC CTG GAT GAA GCT ACT TCG GCC TTG GAC
Ser Asp Pro Lys Ile Leu Leu Leu Asp Glu Ala Thr Ser Ala Leu Asp
595 600 605

ACA AAA TCC GAA GGC GTG GTT CAA GCA GCT TTG GAG AGG GCA GCT GAA
Thr Lys Ser Glu Gly Val Val Gln Ala Ala Leu Glu Arg Ala Ala Glu
610 615 620

GGC CGA ACT ACT ATT GTG ATC GCT CAT CGC CTT TCC ACG ATC AAA ACG
Gly Arg Thr Thr Ile Val Ile Ala His Arg Leu Ser Thr Ile Lys Thr
625 630 635 640

GCG CAC AAC ATT GTG GTT CTG GTC AAT GGC AAA ATT GCT GAA CAA GGA
Ala His Asn Ile Val Val Leu Val Asn Gly Lys Ile Ala Glu Gln Gly
645 650 655

ACT CAC GAT GAA TTG GTT GAC CGC GGA GGC GCT TAT CGC AAA CTT GTG
Thr His Asp Glu Leu Val Asp Arg Gly Gly Ala Tyr Arg Lys Leu Val
660 665 670

GAG GCT CAA CGT ATC AAT GAA CAG AAG GAA GCT GAC GCC TTG GAG GAC
Glu Ala Gln Arg Ile Asn Glu Gln Lys Glu Ala Asp Ala Leu Glu Asp
675 680 685

GCC GAC GCT GAG GAT CTC ACG AAT GCA GAT ATT GCC AAA ATC AAA ACT
Ala Asp Ala Glu Asp Leu Thr Asn Ala Asp Ile Ala Lys Ile Lys Thr
690 695 700

GCG TCA AGC GCA TCA TCC GAT CTC GAC GGA AAA CCC ACA ACC ATT GAC
Ala Ser Ser Ala Ser Ser Asp Leu Asp Gly Lys Pro Thr Thr Ile Asp
705 710 715 720

CGC ACG GGC ACC CAC AAG TCT GTT TCC AGC GCG ATT CTT TCT AAA AGA
Arg Thr Gly Thr His Lys Ser Val Ser Ser Ala Ile Leu Ser Lys Arg
725 730 735

CCC CCC GAA ACA ACT CCG AAA TAC TCA TTA TGG ACG CTG CTC AAA TTT
Pro Pro Glu Thr Thr Pro Lys Tyr Ser Leu Trp Thr Leu Leu Lys Phe
740 745 750

GTT GCT TCC TTC AAC CGC CCT GAA ATC CCG TAC ATG CTC ATC GGT CTT
Val Ala Ser Phe Asn Arg Pro Glu Ile Pro Tyr Met Leu Ile Gly Leu
755 760 765

GTC TTC TCA GTG TTA GCT GGT GGT GGC CAA CCC ACG CAA GCA GTG CTA
Val Phe Ser Val Leu Ala Gly Gly Gly Gln Pro Thr Gln Ala Val Leu
770 775 780

TAT GCT AAA GCC ATC AGC ACA CTC TCG CTC CCA GAA TCA CAA TAT AGC
Tyr Ala Lys Ala Ile Ser Thr Leu Ser Leu Pro Glu Ser Gln Tyr Ser
785 790 795 800

AAG CTT CGA CAT GAT GCG GAT TTC TGG TCA TTG ATG TTC TTC GTG GTT 
Lys Leu Arg His Asp Ala Asp Phe Trp Ser Leu Met Phe Phe Val Val 
805 810 815 

GGT ATC ATT CAG TTT ATC ACG CAG TCA ACC AAT GGT GCT GCA TTT GCC
Gly Ile Ile Gln Phe Ile Thr Gln Ser Thr Asn Gly Ala Ala Phe Ala
820 825 830

GTA TGC TCC GAG AGA CTT ATT CGT CGC GCG AGA AGC ACT GCC TTT CGG
Val Cys Ser Glu Arg Leu Ile Arg Arg Ala Arg Ser Thr Ala Phe Arg
835 840 845

ACG ATA CTC CGT CAA GAC ATT GCT TTC TTT GAC AAG GAA GAG AAT AGC
Thr Ile Leu Arg Gln Asp Ile Ala Phe Phe Asp Lys Glu Glu Asn Ser
850 855 860

ACC GGC GCT CTG ACC TCT TTC CTG TCC ACC GAG ACG AAG CAT CTC TCC
Thr Gly Ala Leu Thr Ser Phe Leu Ser Thr Glu Thr Lys His Leu Ser
865 870 875 880

GGT GTT AGC GGT GTG ACT CTA GGC ACG ATC TTG ATG ACC TCC ACG ACC
Gly Val Ser Gly Val Thr Leu Gly Thr Ile Leu Met Thr Ser Thr Thr
885 890 895

CTA GGA GCG GCT ATC ATT ATT GCC CTG GCG ATT GGG TGG AAA TTG GCC
Leu Gly Ala Ala Ile Ile Ile Ala Leu Ala Ile Gly Trp Lys Leu Ala
900 905 910

TTA GTT TGT ATC TCG GTT GTG CCG GTT CTC CTG GCA TGC GGT TTC TAC
Leu Val Cys Ile Ser Val Val Pro Val Leu Leu Ala Cys Gly Phe Tyr
915 920 925

CGA TTC TAT ATG CTA GCC CAG TTT CAA TCA CGC TCC AAG CTT GCT TAT
Arg Phe Tyr Met Leu Ala Gln Phe Gln Ser Arg Ser Lys Leu Ala Tyr
930 935 940

GAG GGA TCT GCA AAC TTT GCT TGC GAG GCT ACA TCG TCT ATC CGC ACA
Glu Gly Ser Ala Asn Phe Ala Cys Glu Ala Thr Ser Ser Ile Arg Thr
945 950 955 960

GTT GCG TCA TTA ACC CGG GAA AGG GAT GTC TGG GAG ATT TAC CAT GCC
Val Ala Ser Leu Thr Arg Glu Arg Asp Val Trp Glu Ile Tyr His Ala
965 970 975

CAG CTT GAC GCA CAA GGC AGG ACC AGT CTA ATC TCT GTC TTG AGG TCA
Gln Leu Asp Ala Gln Gly Arg Thr Ser Leu Ile Ser Val Leu Arg Ser
980 985 990

TCC CTG TTA TAT GCG TCG TCG CAG GCA CTT GTT TTC TTC TGC GTT GCG
Ser Leu Leu Tyr Ala Ser Ser Gln Ala Leu Val Phe Phe Cys Val Ala
995 1000 1005

CTC GGG TTT TGG TAC GGA GGG ACA CTT CTT GGT CAC CAC GAG TAT GAC 
Leu Gly Phe Trp Tyr Gly Gly Thr Leu Leu Gly His His Glu Tyr Asp 
1010 1015 1020 

ATT TTC CGC TTC TTT GTT TGT TTC TCC GAG ATT CTC TTT GGT GCT CAA
Ile Phe Arg Phe Phe Val Cys Phe Ser Glu Ile Leu Phe Gly Ala Gln
1025 1030 1035 1040

TCC GCG GGC ACC GTC TTT TCC TTT GCA CCA GAC ATG GGC AAG GCG AAG
Ser Ala Gly Thr Val Phe Ser Phe Ala Pro Asp Met Gly Lys Ala Lys
1045 1050 1055

AAT GCG GCC GCC GAA TTC CGA CGA CTG TTC GAC CGA AAG CCA CAA ATT
Asn Ala Ala Ala Glu Phe Arg Arg Leu Phe Asp Arg Lys Pro Gln Ile
1060 1065 1070

GAT AAC TGG TCT GAA GAG GGC GAG AAG CTC GAA ACG GTG GAA GGT GAA
Asp Asn Trp Ser Glu Glu Gly Glu Lys Leu Glu Thr Val Glu Gly Glu
1075 1080 1085

ATC GAA TTT AGG AAC GTG CAC TTC AGA TAC CCG ACC CGC CCA GAA CAG
Ile Glu Phe Arg Asn Val His Phe Arg Tyr Pro Thr Arg Pro Glu Gln
1090 1095 1100

CCT GTC CTG CGC GGC TTG GAC CTG ACC GTG AAG CCT GGA CAA TAT GTT
Pro Val Leu Arg Gly Leu Asp Leu Thr Val Lys Pro Gly Gln Tyr Val
1105 1110 1115 1120

GCG CTT GTC GGA CCC AGC GGT TGT GGC AAG AGT ACC ACC ATT GCA TTG
Ala Leu Val Gly Pro Ser Gly Cys Gly Lys Ser Thr Thr Ile Ala Leu
1125 1130 1135

CTT GAG CGC TTT TAC GAT GCG ATT GCC GGG TCC ATC CTT GTT GAT GGG
Leu Glu Arg Phe Tyr Asp Ala Ile Ala Gly Ser Ile Leu Val Asp Gly
1140 1145 1150

AAG GAC ATA AGT AAA CTA AAT ATC AAC TCC TAC CGC AGC TTT CTG TCA
Lys Asp Ile Ser Lys Leu Asn Ile Asn Ser Tyr Arg Ser Phe Leu Ser
1155 1160 1165

CTG GTC AGC CAG GAG CCG ACA CTG TAC CAG GGC ACC ATC AAG GAA AAC
Leu Val Ser Gln Glu Pro Thr Leu Tyr Gln Gly Thr Ile Lys Glu Asn
1170 1175 1180

ATC TTA CTT GGT ATT GTC GAA GAT GAC GTA CCG GAA GAA TTC TTG ATT
Ile Leu Leu Gly Ile Val Glu Asp Asp Val Pro Glu Glu Phe Leu Ile
1185 1190 1195 1200

AAG GCT TGC AAG GAC GCT AAT ATC TAC GAC TTC ATC ATG TCG CTC CCG
Lys Ala Cys Lys Asp Ala Asn Ile Tyr Asp Phe Ile Met Ser Leu Pro
1205 1210 1215

GAG GGC TTT AAT ACA GTT GTT GGC AGC AAG GGA GGC ATG TTG TCT GGC
Glu Gly Phe Asn Thr Val Val Gly Ser Lys Gly Gly Met Leu Ser Gly
1220 1225 1230

GGC CAA AAG CAA CGT GTG GCC ATT GCC CGA GCC CTT CTT CGG GAT CCC
Gly Gln Lys Gln Arg Val Ala Ile Ala Arg Ala Leu Leu Arg Asp Pro
1235 1240 1245

AAA ATC CTT CTT CTC GAT GAA GCG ACG TCA GCC CTC GAC TCC GAG TCA
Lys Ile Leu Leu Leu Asp Glu Ala Thr Ser Ala Leu Asp Ser Glu Ser
1250 1255 1260

GAA AAG GTC GTC CAG GCG GCT TTG GAT GCC GCT GCC CGA GGC CGA ACC
Glu Lys Val Val Gln Ala Ala Leu Asp Ala Ala Ala Arg Gly Arg Thr
1265 1270 1275 1280

ACA ATC GCC GTT GCA CAC CGA CTC AGC ACG ATT CAA AAG GCG GAC GTT
Thr Ile Ala Val Ala His Arg Leu Ser Thr Ile Gln Lys Ala Asp Val
1285 1290 1295

ATC TAT GTT TTC GAC CAA GGC AAG ATC GTC GAA AGC GGA ACG CAC AGC

GAA CTG GTC CAG AAA AAG GGC CGG TAC TAC GAG CTG GTC AAC TTG CAG
Glu Leu Val Gln Lys Lys Gly Arg Tyr Tyr Glu Leu Val Asn Leu Gln
1315 1320 1325

AGC TTG GGC AAG GGC CAT
Ser Leu Gly Lys Gly His
1330

(2) INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1334 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

Met Ser Pro Leu Glu Thr Asn Pro Leu Ser Pro Glu Thr Ala Met Arg
1 5 10 15

Glu Pro Ala Glu Thr Ser Thr Thr Glu Glu Gln Ala Ser Thr Pro His
20 25 30

Ala Ala Asp Glu Lys Lys Ile Leu Ser Asp Leu Ser Ala Pro Ser Ser
35 40 45

Thr Thr Ala Thr Pro Ala Asp Lys Glu His Arg Pro Lys Ser Ser Ser
50 55 60

Ser Asn Asn Ala Val Ser Val Asn Glu Val Asp Ala Leu Ile Ala His
65 70 75 80

Leu Pro Glu Asp Glu Arg Gln Val Leu Lys Thr Gln Leu Glu Glu Ile
85 90 95



Tyr Asp Glu Leu Thr Lys Asn Val Leu Tyr Phe Val Tyr Leu Gly Ile
145 150 155 160

Glu Lys Val Gly Leu Thr Leu Thr Ala Leu Ala Thr Phe Val Thr Ala
225 230 235 240



Ser Thr Ile Val Ala Leu Val Leu Thr Met Gly Gly Gly Ser Gln Phe
260 265 270 



Leu Thr Val Leu Met Ala Ile Leu Ile Gly Ser Phe Ser Leu Gly Asn
370 375 380

Val Ser Pro Asn Ala Gln Ala Phe Thr Asn Ala Val Ala Ala Ala Ala
385 390 395 400



Asp Val Ser Leu Ser Met Pro Ala Gly Lys Thr Thr Ala Leu Val Gly 
450 455 460 


Pro Ser Gly Ser Gly Lys Ser Thr Val Val Gly Leu Val Glu Arg Phe 
465 470 475 480 



Glu Pro Val Leu Phe Gly Thr Thr Ile Tyr Lys Asn Ile Arg His Gly
515 520 525

Leu Ile Gly Thr Lys Tyr Glu Asn Glu Ser Glu Asp Lys Val Arg Glu
530 535 540

Leu Ile Glu Asn Ala Ala Lys Met Ala Asn Ala His Asp Phe Ile Thr
545 550 555 560

Leu Ser Gly Gly Gln Lys Gln Arg Ile Ala Ile Ala Arg Ala Val Val
580 585 590

Ser Asp Pro Lys Ile Leu Leu Leu Asp Glu Ala Thr Ser Ala Leu Asp
595 600 605

Thr Lys Ser Glu Gly Val Val Gln Ala Ala Leu Glu Arg Ala Ala Glu
610 615 620

Gly Arg Thr Thr Ile Val Ile Ala His Arg Leu Ser Thr Ile Lys Thr
625 630 635 640

Ala His Asn Ile Val Val Leu Val Asn Gly Lys Ile Ala Glu Gln Gly
645 650 655

Thr His Asp Glu Leu Val Asp Arg Gly Gly Ala Tyr Arg Lys Leu Val 
660 665 670 

Glu Ala Gln Arg Ile Asn Glu Gln Lys Glu Ala Asp Ala Leu Glu Asp
675 680 685

Ala Asp Ala Glu Asp Leu Thr Asn Ala Asp Ile Ala Lys Ile Lys Thr
690 695 700

Ala Ser Ser Ala Ser Ser Asp Leu Asp Gly Lys Pro Thr Thr Ile Asp
705 710 715 720

Arg Thr Gly Thr His Lys Ser Val Ser Ser Ala Ile Leu Ser Lys Arg
725 730 735

Pro Pro Glu Thr Thr Pro Lys Tyr Ser Leu Trp Thr Leu Leu Lys Phe 
740 745 750 

Val Ala Ser Phe Asn Arg Pro Glu Ile Pro Tyr Met Leu Ile Gly Leu
755 760 765

Val Phe Ser Val Leu Ala Gly Gly Gly Gln Pro Thr Gln Ala Val Leu
770 775 780

Tyr Ala Lys Ala Ile Ser Thr Leu Ser Leu Pro Glu Ser Gln Tyr Ser
785 790 795 800

Lys Leu Arg His Asp Ala Asp Phe Trp Ser Leu Met Phe Phe Val Val
805 810 815

Gly Ile Ile Gln Phe Ile Thr Gln Ser Thr Asn Gly Ala Ala Phe Ala
820 825 830

Val Cys Ser Glu Arg Leu Ile Arg Arg Ala Arg Ser Thr Ala Phe Arg
835 840 845

Thr Ile Leu Arg Gln Asp Ile Ala Phe Phe Asp Lys Glu Glu Asn Ser
850 855 860

Thr Gly Ala Leu Thr Ser Phe Leu Ser Thr Glu Thr Lys His Leu Ser 
865 870 875 880 

Gly Val Ser Gly Val Thr Leu Gly Thr Ile Leu Met Thr Ser Thr Thr
885 890 895

Leu Gly Ala Ala Ile Ile Ile Ala Leu Ala Ile Gly Trp Lys Leu Ala
900 905 910

Leu Val Cys Ile Ser Val Val Pro Val Leu Leu Ala Cys Gly Phe Tyr
915 920 925

Arg Phe Tyr Met Leu Ala Gln Phe Gln Ser Arg Ser Lys Leu Ala Tyr
930 935 940



Leu Gly Phe Trp Tyr Gly Gly Thr Leu Leu Gly His His Glu Tyr Asp 
1010 1015 1020 

Ile Phe Arg Phe Phe Val Cys Phe Ser Glu Ile Leu Phe Gly Ala Gln
1025 1030 1035 1040



Ile Glu Phe Arg Asn Val His Phe Arg Tyr Pro Thr Arg Pro Glu Gln
1090 1095 1100

Pro Val Leu Arg Gly Leu Asp Leu Thr Val Lys Pro Gly Gln Tyr Val
1105 1110 1115 1120



Leu Val Ser Gln Glu Pro Thr Leu Tyr Gln Gly Thr Ile Lys Glu Asn
1170 1175 1180

Ile Leu Leu Gly Ile Val Glu Asp Asp Val Pro Glu Glu Phe Leu Ile
1185 1190 1195 1200

Lys Ala Cys Lys Asp Ala Asn Ile Tyr Asp Phe Ile Met Ser Leu Pro
1205 1210 1215



Lys Ile Leu Leu Leu Asp Glu Ala Thr Ser Ala Leu Asp Ser Glu Ser
1250 1255 1260 




Thr Ile Ala Val Ala His Arg Leu Ser Thr Ile Gln Lys Ala Asp Val
1285 1290 1295 



Glu Leu Val Gln Lys Lys Gly Arg Tyr Tyr Glu Leu Val Asn Leu Gln
1315 1320 1325 



(2) INFORMATION FOR SEQ ID NO:3:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4002 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: mRNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3: 

AUGUCCCCGC UAGAGACAAA UCCCCUUUCG CCAGAGACUG CUAUGCGCGA ACCUGCUGAG 60 

ACUUCAACGA CGGAGGAGCA AGCUUCUACA CCACACGCUG CGGACGAGAA GAAAAUCCUC 120 

AGCGACCUCU CGGCUCCAUC UAGUACUACA GCAACCCCCG CAGACAAGGA GCACCGUCCU 180 

AAAUCGUCGU CCAGCAAUAA UGCGGUCUCG GUCAACGAAG UCGAUGCGCU UAUUGCGCAC 240 

CUGCCAGAAG ACGAGAGGCA GGUCUUGAAG ACGCAGCUGG AGGAGAUCAA AGUAAACAUC 300 

UCCUUCUUCG GUCUCUGGCG GUAUGCAACA AAGAUGGAUA UACUUAUCAU GGUAAUCAGU 360 

ACAAUCUGUG CCAUUGCUGC CGCGUCGACU UUCCAGAGGA UAAUGUUAUA UCAAAUCUCG 420 

UACGACGAGU UCUAUGAUGA AUUGACCAAG AACGUACUGU ACUUCGUAUA CCUCGGUAUC 480 

GGCGAGUUUG UCACUGUCUA UGUUAGUACU GUUGGCUUCA UCUAUACCGG AGAACACGCC 540 

ACGCAGAAGA UCCGCGAGUA UUACCUUGAG UCUAUCCUGC GCCAGAACAU UGGCUAUUUU 600 

GAUAAACUCG GUGCCGGGGA AGUGACCACC CGUAUAACAG CCGAUACAAA CCUUAUCCAG 660 

GAUGGCAUUU CGGAGAAGGU CGGUCUCACU UUGACUGCCC UGGCGACAUU CGUGACAGCA 720 

UUCAUUAUCG CCUACGUCAA AUACUGGAAG UUGGCUCUAA UUUGCAGCUC AACAAUUGUG 780 

GCCCUCGUUC UCACCAUGGG CGGUGGUUCU CAGUUUAUCA UCAAGUACAG CAAAAAGUCG 840 

CUUGACAGCU ACGGUGCAGG CGGCACUGUU GCGGAAGAGG UCAUCAGCUC CAUCAGAAAU 900 

GCCACAGCGU UUGGCACCCA AGACAAGCUU GCGAAGCAGU AUGAGGUCCA CUUAGACGAA 960

GCUGAGAAAU GGGGAACAAA GAACCAGAUU GUCAUGGGUU UCAUGAUUGG CGCCAUGUUU 1020

GGCCUUAUGU ACUCGAACUA CGGUCUUGGC UUCUGGAUGG GUUCUCGUUU CCUGGUAGAU 1080

GGUGCAGUCG AUGUGGGUGA UAUUCUCACA GUUCUCAUGG CCAUCUUGAU CGGAUCGUUC 1140

UCCUUGGGGA ACGUUAGUCC AAAUGCUCAA GCAUUUACAA ACGCUGUGGC CGCGGCCGCA 1200

AAGAUAUUUG GAACGAUCGA UCGCCAGUCC CCAUUAGAUC CAUAUUCGAA CGAAGGGAAG 1260

ACGCUCGACC AUUUUGAGGG CCACAUUGAG UUACGCAAUG UCAAGCAUAU UUACCCAUCU 1320

AGACCCGAGG UCACCGUCAU GGAGGAUGUU UCUCUGUCAA UGCCCGCUGG AAAAACAACC 1380

GCUUUAGUCG GCCCCUCUGG CUCUGGAAAA AGUACGGUGG UCGGCUUGGU UGAGCGAUUC 1440

UACAUGCCUG UUCGCGGUAC GGUUUUGCUG GAUGGCCAUG ACAUCAAGGA CCUCAAUCUC 1500

CGCUGGCUUC GCCAACAGAU CUCUUUGGUU AGCCAGGAGC CUGUUCUUUU UGGCACGACG 1560

AUUUAUAAGA AUAUUAGGCA CGGUCUCAUC GGCACAAAGU ACGAGAAUGA AUCCGAGGAU 1620

AAGGUCCGGG AACUCAUCGA GAACGCGGCA AAAAUGGCGA AUGCUCAUGA CUUUAUUACU 1680

GCCUUGCCUG AAGGUUAUGA GACCAAUGUU GGGCAGCGUG GCUUUCUCCU UUCAGGUGGC 1740

CAGAAACAGC GCAUUGCAAU CGCCCGUGCC GUUGUUAGUG ACCCAAAAAU CCUGCUCCUG 1800

GAUGAAGCUA CUUCGGCCUU GGACACAAAA UCCGAAGGCG UGGUUCAAGC AGCUUUGGAG 1860

AGGGCAGCUG AAGGCCGAAC UACUAUUGUG AUCGCUCAUC GCCUUUCCAC GAUCAAAACG 1920

GCGCACAACA UUGUGGUUCU GGUCAAUGGC AAAAUUGCUG AACAAGGAAC UCACGAUGAA 1980

UUGGUUGACC GCGGAGGCGC UUAUCGCAAA CUUGUGGAGG CUCAACGUAU CAAUGAACAG 2040

AAGGAAGCUG ACGCCUUGGA GGACGCCGAC GCUGAGGAUC UCACGAAUGC AGAUAUUGCC 2100

AAAAUCAAAA CUGCGUCAAG CGCAUCAUCC GAUCUCGACG GAAAACCCAC AACCAUUGAC 2160

CGCACGGGCA CCCACAAGUC UGUUUCCAGC GCGAUUCUUU CUAAAAGACC CCCCGAAACA 2220

ACUCCGAAAU ACUCAUUAUG GACGCUGCUC AAAUUUGUUG CUUCCUUCAA CCGCCCUGAA 2280

AUCCCGUACA UGCUCAUCGG UCUUGUCUUC UCAGUGUUAG CUGGUGGUGG CCAACCCACG 2340

CAAGCAGUGC UAUAUGCUAA AGCCAUCAGC ACACUCUCGC UCCCAGAAUC ACAAUAUAGC 2400

AAGCUUCGAC AUGAUGCGGA UUUCUGGUCA UUGAUGUUCU UCGUGGUUGG UAUCAUUCAG 2460

UUUAUCACGC AGUCAACCAA UGGUGCUGCA UUUGCCGUAU GCUCCGAGAG ACUUAUUCGU 2520

CGCGCGAGAA GCACUGCCUU UCGGACGAUA CUCCGUCAAG ACAUUGCUUU CUUUGACAAG 2580

GAAGAGAAUA GCACCGGCGC UCUGACCUCU UUCCUGUCCA CCGAGACGAA GCAUCUCUCC 2640

GGUGUUAGCG GUGUGACUCU AGGCACGAUC UUGAUGACCU CCACGACCCU AGGAGCGGCU 2700

AUCAUUAUUG CCCUGGCGAU UGGGUGGAAA UUGGCCUUAG UUUGUAUCUC GGUUGUGCCG 2760

GUUCUCCUGG CAUGCGGUUU CUACCGAUUC UAUAUGCUAG CCCAGUUUCA AUCACGCUCC 2820

AAGCUUGCUU AUGAGGGAUC UGCAAACUUU GCUUGCGAGG CUACAUCGUC UAUCCGCACA 2880

GUUGCGUCAU UAACCCGGGA AAGGGAUGUC UGGGAGAUUU ACCAUGCCCA GCUUGACGCA 2940

CAAGGCAGGA CCAGUCUAAU CUCUGUCUUG AGGUCAUCCC UGUUAUAUGC GUCGUCGCAG 3000

GCACUUGUUU UCUUCUGCGU UGCGCUCGGG UUUUGGUACG GAGGGACACU UCUUGGUCAC 3060

CACGAGUAUG ACAUUUUCCG CUUCUUUGUU UGUUUCUCCG AGAUUCUCUU UGGUGCUCAA 3120

UCCGCGGGCA CCGUCUUUUC CUUUGCACCA GACAUGGGCA AGGCGAAGAA UGCGGCCGCC 3180

GAAUUCCGAC GACUGUUCGA CCGAAAGCCA CAAAUUGAUA ACUGGUCUGA AGAGGGCGAG 3240

AAGCUCGAAA CGGUGGAAGG UGAAAUCGAA UUUAGGAACG UGCACUUCAG AUACCCGACC 3300

CGCCCAGAAC AGCCUGUCCU GCGCGGCUUG GACCUGACCG UGAAGCCUGG ACAAUAUGUU 3360

GCGCUUGUCG GACCCAGCGG UUGUGGCAAG AGUACCACCA UUGCAUUGCU UGAGCGCUUU 3420

UACGAUGCGA UUGCCGGGUC CAUCCUUGUU GAUGGGAAGG ACAUAAGUAA ACUAAAUAUC 3480

AACUCCUACC GCAGCUUUCU GUCACUGGUC AGCCAGGAGC CGACACUGUA CCAGGGCACC 3540

AUCAAGGAAA ACAUCUUACU UGGUAUUGUC GAAGAUGACG UACCGGAAGA AUUCUUGAUU 3600

AAGGCUUGCA AGGACGCUAA UAUCUACGAC UUCAUCAUGU CGCUCCCGGA GGGCUUUAAU 3660

ACAGUUGUUG GCAGCAAGGG AGGCAUGUUG UCUGGCGGCC AAAAGCAACG UGUGGCCAUU 3720

GCCCGAGCCC UUCUUCGGGA UCCCAAAAUC CUUCUUCUCG AUGAAGCGAC GUCAGCCCUC 3780

GACUCCGAGU CAGAAAAGGU CGUCCAGGCG GCUUUGGAUG CCGCUGCCCG AGGCCGAACC 3840

ACAAUCGCCG UUGCACACCG ACUCAGCACG AUUCAAAAGG CGGACGUUAU CUAUGUUUUC 3900

GACCAAGGCA AGAUCGUCGA AAGCGGAACG CACAGCGAAC UGGUCCAGAA AAAGGGCCGG 3960

UACUACGAGC UGGUCAACUU GCAGAGCUUG GGCAAGGGCC AU 4002



